Patent abstract:
AUDIO SYSTEM AND OPERATING METHOD FOR AN AUDIO SYSTEM
An audio system comprises a receiver (301) for receiving an audio signal, such as an audio object or a channel signal of a spatial multi-channel signal. A binaural circuit (303) generates a binaural output signal by processing the audio signal. The processing is representative of a binaural transfer function that provides a virtual sound source position for the audio signal. A measurement circuit (307) generates measurement data indicative of a characteristic of the acoustic environment, and a determination circuit (311) determines an acoustic environment parameter in response to the measurement data. The acoustic environment parameter can typically be a reverberation parameter, such as a reverberation time. An adaptation circuit (313) dynamically adapts the binaural transfer function in response to the acoustic environment parameter. For example, the adaptation can modify a reverberation parameter to more closely resemble the reverberation characteristics of the acoustic environment.
Publication number: BR112013017070B1
Application number: R112013017070-0
Filing date: 2012-01-03
Publication date: 2021-03-09
Inventors: Arnoldus Werner Johannes Oomen; Dirk Jeroen Breebaart; Jeroen Gerardus Henricus Koppens; Erik Gosuinus Petrus Schuijers
Applicant: Koninklijke Philips N.V.
IPC main class:
Patent description:

FIELD OF THE INVENTION
The invention relates to an audio system and an operating method therefor and, in particular, to the virtual spatial rendering of audio signals.
BACKGROUND OF THE INVENTION
Spatial sound reproduction beyond simple stereo has become commonplace for applications such as home theater systems. Typically, these systems use loudspeakers positioned at specific spatial positions. In addition, systems have been developed that provide a perception of spatial sound from headphones. Conventional stereo reproduction tends to provide sounds that are perceived as originating from inside the user's head. However, systems have been developed that provide a full spatial sound perception based on binaural signals supplied directly to the user's ears by headphones. These systems are generally referred to as virtual sound systems, since they provide a perception of virtual sound sources at positions where there is no real sound source.
Virtual surround sound is a technology that tries to create the perception of sound sources surrounding the listener that are not physically present. In these systems, the sound does not seem to originate from inside the user's head, as is the case with conventional headphone reproduction. Rather, the sound is perceived as originating from outside the user's head, as in natural hearing in the absence of headphones.
In addition to a more realistic experience, virtual surround audio also tends to have a positive effect on listener fatigue and speech intelligibility.
In order to achieve this perception, it is necessary to employ some means of fooling the human auditory system into believing that the sound is coming from the desired positions. A well-known approach to providing the virtual surround sound experience is the use of binaural recording. In these approaches, the sound recording uses a dedicated microphone arrangement and is intended for playback over headphones. The recording is made either by placing microphones in an individual's ear canals or in an artificial head, which is a bust that includes auricles (external ears). The use of such an artificial head including auricles provides a spatial impression very similar to the impression the person listening to the recordings would have had if present during the recording. However, because each person's auricles are unique, and the filtering they impose on the sound depends on the direction of incidence of the incoming sound wave, the localization of sources is likewise individual-dependent. Indeed, the specific cues used to localize sources are learned by each person in early childhood. Therefore, any mismatch between the auricles used during recording and those of the listener can lead to degraded perception and erroneous spatial impressions.
By measuring the impulse response from a sound source at a specific location in three-dimensional space to the microphones in the ears of the artificial head for each individual, the so-called Head-Related Impulse Responses (HRIRs) can be determined. HRIRs can be used to create a binaural recording that simulates multiple sources at different locations. This can be accomplished by convolving each sound source with the pair of HRIRs corresponding to the position of that sound source. An HRIR can also be referred to as a Head-Related Transfer Function (HRTF); HRTF and HRIR are therefore equivalent. In the case where the HRIRs also include an environmental effect, they are referred to as Binaural Room Impulse Responses (BRIRs). BRIRs consist of an anechoic part that depends only on the individual's anthropometric attributes (such as head size, ear shape, etc.), followed by a reverberant part that characterizes the combination of room and anthropometric properties.
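The HRIR-pair convolution described above can be sketched as follows. This is a minimal illustration, not taken from the patent: the two-tap "HRIRs" below are invented toy values that merely mimic a source to the listener's left (the right ear receives the sound later and attenuated).

```python
import numpy as np

def binauralize(source, hrir_left, hrir_right):
    """Convolve a mono source signal with a left/right HRIR pair
    to obtain the two channels of a binaural signal."""
    left = np.convolve(source, hrir_left)
    right = np.convolve(source, hrir_right)
    return left, right

# Toy HRIRs (hypothetical values): the right-ear response is delayed
# and attenuated relative to the left-ear response.
hrir_l = np.array([1.0, 0.3])
hrir_r = np.array([0.0, 0.0, 0.6, 0.2])

source = np.array([1.0, 0.0, 0.0, 0.0])  # unit impulse as test source
left, right = binauralize(source, hrir_l, hrir_r)
```

In a real system, each sound source would be convolved with the measured HRIR pair for its desired position and the per-source binaural signals summed.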
The reverberant part contains two temporal regions, which generally overlap. The first region contains the so-called early reflections, which are isolated reflections of the sound source off the walls or obstacles within the room, before reaching the eardrum (or measuring microphone). As the time interval increases, the number of reflections present in a fixed time interval increases, now also containing reflections of higher order.
The second region of the reverberant part is the part where these reflections are no longer isolated. This region is called the late or diffuse reverberation tail.
The reverberant part contains cues that give the auditory system information about the distance of the source and the size and acoustic properties of the room. In addition, it is subject-dependent due to the filtering of the reflections with the HRIRs. The energy of the reverberant part relative to the anechoic part largely determines the perceived distance of the sound source. The density of the (early) reflections contributes to the perceived size of the room. The T60 reverberation time is defined as the time it takes for the reflections to drop 60 dB in energy level. The reverberation time gives information about the acoustic properties of the room: whether its walls are very reflective (for example, a bathroom) or whether there is a lot of sound absorption (for example, a room with furniture, carpet and curtains), as well as the volume (size) of the room.
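A standard way to estimate T60 from a measured impulse response is Schroeder backward integration of the squared response, fitting the decay slope and extrapolating to -60 dB. The sketch below is illustrative only (the patent does not prescribe this method); the synthetic impulse response is exponentially decaying noise constructed to have a known T60 of 0.4 s.

```python
import numpy as np

def rt60_schroeder(impulse_response, fs):
    """Estimate T60 by Schroeder backward integration: integrate the
    squared response from the end, fit a line to the -5..-25 dB region
    of the energy decay curve, and extrapolate to -60 dB."""
    energy = impulse_response ** 2
    edc = np.cumsum(energy[::-1])[::-1]        # energy decay curve
    edc_db = 10.0 * np.log10(edc / edc[0])
    t = np.arange(len(edc_db)) / fs
    mask = (edc_db <= -5.0) & (edc_db >= -25.0)
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)  # dB per second
    return -60.0 / slope

fs = 8000
t = np.arange(fs) / fs                          # 1 second
rng = np.random.default_rng(0)
env = 10.0 ** (-3.0 * t / 0.4)                  # amplitude envelope for T60 = 0.4 s
ir = rng.standard_normal(fs) * env              # synthetic decaying impulse response
estimate = rt60_schroeder(ir, fs)               # should be close to 0.4 s
```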
In addition to the use of measured impulse responses that incorporate a given acoustic environment, synthetic reverberation algorithms are generally employed, due to the ability to modify certain properties of the acoustic simulation and due to their relatively low computational complexity.
An example of a system that uses virtual surround techniques is MPEG Surround, one of the major recent advances in multi-channel audio coding, standardized in MPEG (ISO/IEC 23003-1:2007, MPEG Surround). MPEG Surround is a multi-channel audio coding tool that allows existing mono or stereo encoders to be extended to multiple channels. FIGURE 1 illustrates a block diagram of a stereo core encoder extended with MPEG Surround. First, the MPEG Surround encoder creates a stereo downmix of the multi-channel input signal. The stereo downmix is encoded into a bit stream using a core encoder, for example, HE-AAC. Then, spatial parameters are estimated from the multi-channel input signal. These parameters are encoded into a spatial bit stream. The resulting core encoder bit stream and the spatial bit stream are merged to create the overall MPEG Surround bit stream.
Typically, the spatial bit stream is contained in the auxiliary data portion of the core encoder bit stream. On the decoder side, the core and spatial bit streams are first separated. The core stereo bit stream is decoded in order to reproduce the stereo downmix. This downmix, together with the spatial bit stream, is fed into the MPEG Surround decoder. The spatial bit stream is decoded, resulting in spatial parameters. The spatial parameters are then used to upmix the stereo downmix in order to obtain the multi-channel output signal, which is an approximation of the original multi-channel input signal.
Since the spatial image of the multi-channel input signal is parameterized, MPEG Surround also allows the decoding of the same multi-channel bit stream on rendering devices other than a multi-channel loudspeaker configuration. One example is virtual playback over headphones, which is referred to as the MPEG Surround binaural decoding process. In this mode, a realistic surround experience can be provided using regular headphones.
FIGURE 2 illustrates a block diagram of the stereo core codec extended with MPEG Surround, where the output is decoded binaurally. The encoding process is identical to that of FIGURE 1. After decoding the stereo bit stream, the spatial parameters are combined with the HRTF/HRIR data to produce the so-called binaural output.
Based on the concept of MPEG Surround, MPEG standardized 'Spatial Audio Object Coding' (SAOC) (ISO/IEC 23003-2:2010, Spatial Audio Object Coding).
From a high-level perspective, in SAOC, sound objects are efficiently coded instead of channels. Whereas in MPEG Surround each loudspeaker channel can be considered to originate from a different mix of sound objects, in SAOC these individual sound objects are, to some extent, available at the decoder for interactive manipulation. Similar to MPEG Surround, a mono or stereo downmix is also created in SAOC, where the downmix is encoded using a standard downmix encoder, such as HE-AAC. The object parameters are encoded and embedded in the auxiliary data parts of the encoded downmix bit stream. On the decoder side, by manipulating these parameters, the user can control several aspects of the individual objects, such as position, amplification/attenuation and equalization, and can also apply effects such as distortion and reverberation.
The quality of virtual surround rendering of stereo or multi-channel content can be significantly improved by so-called phantom materialization, as described in Breebaart, J., Schuijers, E. (2008), "Phantom materialization: A novel method to enhance stereo audio reproduction on headphones", IEEE Trans. on Audio, Speech and Language Processing 16, 1503-1511.
Rather than building a virtual stereo signal by assuming two sound sources originating from the virtual loudspeaker positions, the phantom materialization approach decomposes the sound signal into a direct signal component and an indirect/uncorrelated signal component. The direct component is synthesized by simulating a virtual loudspeaker at the phantom position. The indirect component is synthesized by simulating virtual loudspeakers in the virtual direction(s) of the diffuse sound field. The phantom materialization process has the advantage that it does not impose the limitations of a loudspeaker configuration on the virtual rendering scenario.
It has been found that the reproduction of virtual spatial sound provides very attractive spatial experiences in many scenarios. However, it has also been found that the approach can, in some scenarios, result in experiences that do not completely correspond to the spatial experience that would result from a real-world scenario with real sound sources at the simulated positions in three-dimensional space.
It has been suggested that the spatial perception of virtual audio rendering may be affected by interference in the brain between the position cues provided by the audio and the position cues provided by the user's vision.
In daily life, visual cues are (typically subconsciously) combined with audible cues to improve spatial perception. One example is that a person's speech intelligibility increases when the movement of his lips can also be observed. In another example, it was found that a person can be fooled by providing a visual cue to support a virtual sound source, for example, by placing a dummy loudspeaker at the location where a virtual sound source is generated. The visual cue will therefore enhance or modify the virtualization. A visual cue can, to a certain extent, also change the perceived location of a sound source, as in the case of a ventriloquist. Conversely, the human brain has difficulty localizing sound sources that have no supporting visual cue (for example, in wave field synthesis), which is, in fact, contrary to human nature.
Another example is the leakage of external sound sources from the listener's environment, which are mixed with the virtual sound sources generated by a headphone-based audio system. Depending on the audio content and the user's location, the acoustic properties of the physical and virtual environments can differ considerably, resulting in ambiguity in relation to the listener's environment. These mixtures of acoustic environments can cause unnatural and unrealistic sound reproduction.
Still, there are many aspects of the interaction with visual cues that are not well understood and, in fact, the effect of visual cues on the reproduction of virtual spatial sound is not completely understood.
Thus, an improved audio system would be advantageous and, in particular, an approach that allows for increased flexibility, easier implementation, easier operation, an improved spatial user experience, enhanced virtual spatial sound generation and/or improved performance would be advantageous.
SUMMARY OF THE INVENTION
Likewise, the invention preferably seeks to reduce, alleviate or eliminate one or more of the disadvantages mentioned above, alone or in combination.
According to an aspect of the invention, the audio system according to claim 1 is provided.
The invention can provide an enhanced spatial experience. In many embodiments, a more natural spatial experience can be perceived and the reproduction of the sound may seem less artificial. In fact, the characteristics of the virtual sound can be adapted to be more aligned with other position cues, such as visual cues. A more realistic spatial sound perception can therefore be achieved, with the user being provided with a virtual sound reproduction that seems more natural and with improved externalization.
The audio signal can correspond to a single sound source, and the processing of the audio signal can be such that the audio represented by the audio signal is rendered from a desired virtual position for the sound source. The audio signal can, for example, correspond to a single audio channel (such as a sound channel in a surround sound system) or it can, for example, correspond to a single audio object. The audio signal can specifically be a single channel audio signal of a spatial multi-channel signal. Each spatial signal can be processed to be rendered so that it is perceived as originating from a given virtual position.
The audio signal can be represented by a time domain signal, a frequency domain signal and/or a parameterized signal (such as an encoded signal). As a specific example, the audio signal can be represented by data values in a time-frequency tile format. In some embodiments, the audio signal may have position information associated with it. For example, an audio object may be provided with position information that indicates a desired sound source position for the audio signal.
In some scenarios, the position information can be provided as spatial upmix parameters. The system can be arranged to further adapt the binaural transfer function in response to the position information for the audio signal. For example, the system can select the binaural transfer function to provide a sound position indicator corresponding to the indicated position.
The binaural output signal may comprise signal components of a plurality of audio signals, each of which may have been processed according to a binaural transfer function, where the binaural transfer function for each audio signal may correspond to the desired position for that audio signal. Each of the binaural transfer functions can, in many embodiments, be adapted in response to the acoustic environment parameter.
The processing can specifically apply the binaural transfer function to the audio signal or a signal derived therefrom (for example, by amplification, filtering, etc.). The relationship between the binaural output signal and the audio signal is dependent on/reflected by the binaural transfer function. The audio signal can specifically generate a signal component of the binaural output signal which corresponds to the application of a binaural transfer function to the audio signal, the transfer function being applied to the audio signal so as to generate a binaural output signal that provides a perception of the audio source being at a desired position. The binaural transfer function can include a contribution from, or correspond to, an HRTF, HRIR or BRIR.
The binaural transfer function can be applied to the audio signal (or a signal derived therefrom) in the time domain, in the frequency domain, or as a combination of both. For example, the binaural transfer function can be applied to time-frequency tiles, for example, by applying a complex binaural transfer function value to each time-frequency tile. In other examples, the audio signal can be filtered through a filter that implements the binaural transfer function.
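The per-tile application mentioned above can be sketched as an element-wise complex multiplication. This is a simplified illustration under the assumption of one transfer-function value per frequency bin (real HRTFs are finer-grained, frequency-dependent and per-subject); the tile values and gains below are invented.

```python
import numpy as np

def apply_tile_transfer(tiles, hrtf_left, hrtf_right):
    """Apply a binaural transfer function per time-frequency tile.

    tiles:                 complex array, shape (num_frames, num_bins)
    hrtf_left/hrtf_right:  complex arrays, shape (num_bins,), one
                           transfer-function value per frequency bin
    """
    return tiles * hrtf_left[None, :], tiles * hrtf_right[None, :]

frames = np.ones((3, 4), dtype=complex)              # dummy tile values
hl = np.array([1.0, 0.5, 0.25, 0.1], dtype=complex)  # hypothetical left response
hr = hl * np.exp(-1j * 0.2)                          # small interaural phase shift
left_tiles, right_tiles = apply_tile_transfer(frames, hl, hr)
```

The multiplication per tile is the frequency-domain counterpart of the time-domain filtering mentioned as the alternative.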
According to an optional aspect of the invention, the acoustic environment parameter comprises a reverb parameter for the acoustic environment.
This can allow for a particularly advantageous adaptation of the virtual sound to provide an enhanced and typically more natural user experience of a sound system that uses virtual sound source positioning.
According to an optional aspect of the invention, the acoustic environment parameter comprises at least one of: a reverberation time; a reverberation energy relative to a direct path energy; a frequency spectrum of at least part of an ambient impulse response; a modal density of at least part of an ambient impulse response; an echo density of at least part of an ambient impulse response; an interaural coherence or correlation; a level of early reflections; and an estimate of room size.
These parameters can allow a particularly advantageous adaptation of the virtual sound to provide an enhanced and typically more natural user experience of a sound system that uses virtual sound source positioning. In addition, the parameters can facilitate implementation and / or operation.
According to an optional aspect of the invention, the adaptation circuit is arranged to adapt a reverberation characteristic of the binaural transfer function.
This can allow a particularly advantageous adaptation of the virtual sound to provide an enhanced and typically more natural user experience of a sound system that uses virtual sound source positioning. The approach may allow for easier operation and / or implementation, since the reverb characteristics are particularly suitable for adaptation. The modification can be such that the processing is modified to correspond to a binaural transfer function with different reverberation characteristics.
According to an optional aspect of the invention, the adaptation circuit is arranged to adapt at least one of the following characteristics of the binaural transfer function: a reverberation time; a reverberation energy relative to a direct sound energy; a frequency spectrum of at least part of the binaural transfer function; a modal density of at least part of the binaural transfer function; an echo density of at least part of the binaural transfer function; an interaural coherence or correlation; and a level of early reflections of at least part of the binaural transfer function.
These parameters can allow a particularly advantageous adaptation of the virtual sound to provide an enhanced and typically more natural user experience of a sound system that uses virtual sound source positioning. In addition, the parameters can facilitate implementation and / or operation.
According to an optional aspect of the invention, the processing comprises a combination of a predetermined binaural transfer function and a variable binaural transfer function adapted in response to the acoustic environment parameter.
This can, in many scenarios, provide for easier and / or improved implementation and / or operation. The predetermined binaural transfer function and the variable binaural transfer function can be combined. For example, transfer functions can be applied to the audio signal in series or can be applied to the audio signal in parallel with the resulting signals being combined.
The predetermined binaural transfer function can be fixed and can be independent of the acoustic environment parameter. The variable binaural transfer function can be a transfer function for simulation of an acoustic environment.
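One way to realize the parallel combination described above can be sketched as a fixed anechoic HRIR filter plus an environment-dependent reverberant tail whose gain and shape would be adapted to the measured environment. This is a hypothetical sketch, not the patent's implementation; all filter values below are invented.

```python
import numpy as np

def render_parallel(source, anechoic_hrir, reverb_tail, reverb_gain):
    """Combine a predetermined (fixed) anechoic part with a variable
    reverberant part applied in parallel and summed; reverb_tail and
    reverb_gain stand in for the environment-adapted component."""
    direct = np.convolve(source, anechoic_hrir)
    wet = reverb_gain * np.convolve(source, reverb_tail)
    n = max(len(direct), len(wet))
    out = np.zeros(n)
    out[:len(direct)] += direct
    out[:len(wet)] += wet
    return out

src = np.array([1.0, 0.0])
direct_part = np.array([1.0])           # trivial anechoic response
tail = np.array([0.0, 0.5, 0.25])       # hypothetical decaying reverb tail
out = render_parallel(src, direct_part, tail, reverb_gain=0.8)
```

The serial alternative mentioned in the text would instead convolve the two transfer functions with each other before applying them to the signal.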
According to an optional aspect of the invention, the adaptation circuit is arranged to dynamically update the binaural transfer function.
The dynamic update can be in real time. The invention can allow a system that automatically and continuously adapts the provision of sound to the environment in which it is used. For example, as a user carrying the audio system moves, the system can automatically adapt the rendered audio to match the specific acoustic environment, for example, a specific room. The measurement circuit can continuously measure the environment characteristic, and the processing can be updated continuously in response.
According to an optional aspect of the invention, the adaptation circuit is arranged to modify the binaural transfer function only when the environment characteristic meets a criterion.
This can provide an improved user experience in many scenarios. In particular, it can, in many embodiments, provide a more stable experience. The adaptation circuit can, for example, only modify a characteristic of the binaural transfer function when the acoustic environment parameter meets a criterion. The criterion may, for example, be that a difference between the value of the acoustic environment parameter and the previous value used to adapt the binaural transfer function exceeds a threshold.
According to an optional aspect of the invention, the adaptation circuit is arranged to restrict a transition rate for the binaural transfer function.
This can provide an enhanced user experience and can make the adaptation to specific environmental conditions less noticeable. Modifications of the binaural transfer function can be made subject to a low-pass filtering effect, attenuating changes above, advantageously, 1 Hz.
For example, step changes for the binaural transfer function can be restricted to be gradual transitions lasting about 1-5 seconds.
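The restricted transition rate described above can be sketched as a one-pole low-pass smoothing of the adapted parameter; the update rate, cut-off and reverberation-time values below are invented for illustration, not taken from the patent.

```python
import math

def smooth_parameter(current, target, rate_hz, update_hz):
    """One-pole low-pass step toward the target value, limiting how fast
    a binaural transfer function parameter may change (cut-off ~rate_hz)."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * rate_hz / update_hz)
    return current + alpha * (target - current)

# Smoothing a step change in reverberation time from 0.3 s to 0.8 s,
# with parameter updates at 20 Hz and a ~0.2 Hz transition limit:
value = 0.3
for _ in range(20):                     # one second of updates
    value = smooth_parameter(value, 0.8, rate_hz=0.2, update_hz=20.0)
# After one second the value has moved only part-way toward 0.8,
# so the step is spread out over a gradual transition of a few seconds.
```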
According to an optional aspect of the invention, the audio system further comprises: a data store for storing binaural transfer function data; a circuit for retrieving binaural transfer function data from the data store in response to the acoustic environment parameter; and wherein the adaptation circuit is arranged to adapt the binaural transfer function in response to the recovered binaural transfer function data.
This can provide a particularly efficient implementation in many scenarios. The approach can specifically reduce computational resource requirements.
In some embodiments, the audio system may also comprise a circuit to detect that no binaural transfer function data stored in the data store is associated with acoustic environment characteristics corresponding to the acoustic environment parameter, and, in response, to generate and store binaural transfer function data in the data store together with data characterizing the associated acoustic environment.
According to an optional aspect of the invention, the audio system further comprises: a test signal circuit arranged to radiate a sound test signal in the acoustic environment; and wherein the measurement circuit is arranged to capture the sound signal received in the environment, the received audio signal comprising a signal component that originates from the radiated sound test signal; and the determination circuit is arranged to determine the acoustic environment parameter in response to the sound test signal.
This can provide a low complexity, yet precise and practical way of determining the acoustic environment parameter. The determination of the acoustic environment parameter can be specifically in response to a correlation between the received test signal and the audio test signal. For example, frequency and time characteristics can be compared and used to determine the acoustic environment parameter.
According to an optional aspect of the invention, the determination circuit is arranged to determine an ambient impulse response in response to the received sound signal and to determine the acoustic environment parameter in response to the ambient impulse response.
This can provide a particularly robust, low complexity and / or accurate approach for determining the acoustic environment parameter.
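The impulse-response determination from a known radiated test signal can be sketched as a frequency-domain deconvolution: divide the spectrum of the captured signal by that of the emitted test signal. This is one standard method among several (swept sines and MLS sequences are common alternatives); the broadband noise test signal and the short "room response" below are synthetic stand-ins.

```python
import numpy as np

def estimate_impulse_response(test_signal, received, n):
    """Estimate the first n samples of the room impulse response by
    frequency-domain deconvolution of a known broadband test signal."""
    size = len(received)
    T = np.fft.rfft(test_signal, size)
    R = np.fft.rfft(received, size)
    eps = 1e-12                          # regularization against spectral nulls
    H = R * np.conj(T) / (np.abs(T) ** 2 + eps)
    return np.fft.irfft(H, size)[:n]

rng = np.random.default_rng(1)
test = rng.standard_normal(4096)                 # broadband test signal
true_ir = np.array([1.0, 0.0, 0.4, 0.0, 0.15])   # hypothetical room response
received = np.convolve(test, true_ir)            # simulated capture
est = estimate_impulse_response(test, received, len(true_ir))
```

The acoustic environment parameter (for example, T60) could then be derived from the estimated impulse response.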
According to an optional aspect of the invention, the adaptation circuit is further arranged to update the binaural transfer function in response to a user's position.
This can provide a particularly attractive user experience. For example, the virtual sound rendering can be updated continuously as the user moves around, thereby providing a continuous adaptation not only, for example, to the environment, but also to the user's position in the environment.
In some embodiments, the acoustic environment parameter is dependent on a user's position.
This can provide a particularly attractive user experience. For example, the rendering of the virtual sound can be updated continuously as the user moves, thereby providing a continuous adaptation not only, for example, to the environment, but also to the user's position in the environment. As an example, the acoustic environment parameter can be determined from a measured impulse response that can change dynamically as the user moves within an environment. The user's position can be a user orientation or location.
According to an optional aspect of the invention, the binaural circuit comprises a reverberator; and the adaptation circuit is arranged to adapt a reverberation processing of the reverberator in response to the acoustic environment parameter.
This can provide a particularly practical approach for modifying the processing to reflect the modified binaural transfer functions. The reverberator can provide a particularly efficient approach to adapting the characteristics while still being simple to control. The reverberator can, for example, be a Jot reverberator, as described, for example, in J.-M. Jot and A. Chaigne, "Digital delay networks for designing artificial reverberators", Audio Engineering Society Convention, Feb. 1991.
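A Jot reverberator belongs to the family of feedback delay networks (FDNs). The minimal sketch below illustrates the core structure only: parallel delay lines fed back through an energy-preserving (Hadamard) matrix scaled by a gain that controls the decay. A real Jot design additionally places absorption filters in each line to shape the frequency-dependent T60, which is exactly the kind of characteristic the adaptation circuit could modify; the delay lengths and gain below are invented.

```python
import numpy as np

def fdn_reverb(x, delays, feedback_gain):
    """Minimal 4-line feedback delay network reverberator sketch."""
    n = len(delays)
    # Normalized 4x4 Hadamard matrix: orthogonal, so the loop is
    # energy-preserving before the scalar feedback gain is applied.
    A = feedback_gain * np.array([[1, 1, 1, 1],
                                  [1, -1, 1, -1],
                                  [1, 1, -1, -1],
                                  [1, -1, -1, 1]]) / 2.0
    lines = [np.zeros(d) for d in delays]
    out = np.zeros(len(x))
    for t in range(len(x)):
        taps = np.array([line[-1] for line in lines])   # delay-line outputs
        out[t] = taps.sum()
        back = A @ taps + x[t]                          # feedback + input
        for i in range(n):
            lines[i] = np.concatenate(([back[i]], lines[i][:-1]))
    return out

impulse = np.zeros(2000)
impulse[0] = 1.0
# Mutually prime delay lengths give a dense, non-repeating echo pattern.
tail = fdn_reverb(impulse, delays=[149, 211, 263, 293], feedback_gain=0.6)
```

With a feedback gain below 1 the impulse response decays; increasing the gain lengthens the reverberation time, which is one simple handle for matching a measured environment.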
According to one aspect of the invention, an operating method for an audio system is provided, according to claim 14.
These and other aspects, characteristics and advantages of the invention will be apparent from, and elucidated with reference to, the embodiment(s) described hereinafter.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the invention will be described, by way of example only, with reference to the drawings, in which:
Figure 1 illustrates a block diagram of a stereo core codec extended with MPEG Surround;
Figure 2 illustrates a block diagram of a stereo core codec extended with MPEG Surround and providing a binaural output signal;
Figure 3 illustrates an example of elements of an audio system, according to some embodiments of the invention;
Figure 4 illustrates an example of elements of a binaural processor, according to some embodiments of the invention;
Figure 5 illustrates an example of elements of a binaural signal processor, according to some embodiments of the invention;
Figure 6 illustrates an example of elements of a binaural signal processor, according to some embodiments of the invention; and
Figure 7 illustrates an example of elements of a Jot reverberator.
DETAILED DESCRIPTION OF SOME EMBODIMENTS OF THE INVENTION
Figure 3 illustrates an example of an audio system, according to some embodiments of the invention. The audio system is a virtual sound system that emulates spatial sound source positions by generating a binaural signal that comprises a signal for each of the user's ears. Typically, the binaural audio is provided to the user through a pair of headphones or the like.
The audio system comprises a receiver 301 that receives an audio signal that is to be rendered by the audio system. The audio signal is intended to be rendered as a sound source with a desired virtual position. Thus, the audio system renders the audio signal so that the user (at least approximately) perceives the signal as originating from the desired position, or at least direction.
In the example, the audio signal is therefore considered to correspond to a single audio source. As such, the audio signal is associated with a desired position. The audio signal can correspond, for example, to a spatial channel signal, and specifically the audio signal can be a single signal of a spatial multi-channel signal. Such a signal can implicitly have an associated desired position. For example, a center channel signal is associated with a position directly in front of the listener, a left front channel is associated with a position to the front and left of the listener, a left rear signal is associated with a position behind and to the left of the listener, etc. The audio system can therefore render the signal to appear to arrive from that position.
As another example, the audio signal can be an audio object and can, for example, be an audio object that the user can freely position in (virtual) space. Thus, in some examples, the desired position can be generated or selected locally, for example, by the user.
The audio signal can, for example, be provided and/or processed as a time domain signal. Alternatively or additionally, the audio signal can be provided and/or processed as a frequency domain signal. In fact, in many systems, the audio system may be able to switch between these representations and apply processing in the domain that is most efficient for the specific operation.
In some embodiments, the audio signal can be represented as a time-frequency tile signal. Thus, the signal can be divided into tiles, where each tile corresponds to a time interval and a frequency interval. For each of these tiles, the signal can be represented as a set of values. Typically, a single complex signal value is provided for each time-frequency tile.
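A common way to obtain such a tiled representation is a short-time Fourier transform: one complex value per (frame, frequency-bin) pair. The sketch below is a generic illustration with invented frame and hop sizes, not a representation prescribed by the patent.

```python
import numpy as np

def stft_tiles(x, frame_len, hop):
    """Split a signal into time-frequency tiles: one complex value per
    (frame, frequency-bin) pair, via a windowed FFT per frame."""
    window = np.hanning(frame_len)
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frames.append(np.fft.rfft(window * x[start:start + frame_len]))
    return np.array(frames)          # shape: (num_frames, frame_len // 2 + 1)

fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t)     # 1 kHz tone as test signal
tiles = stft_tiles(x, frame_len=256, hop=128)
# The tone concentrates its energy in the bin at 1000 Hz (bin 32 here,
# since each bin spans 8000/256 = 31.25 Hz).
```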
In the description, a single audio signal is described and processed to be rendered from a virtual position. However, it will be appreciated that, in most instances, the sound rendered to the listener comprises sounds from many different sound sources. Thus, in typical embodiments, a plurality of audio signals are received and rendered, typically from different virtual positions. For example, for a virtual surround sound system, typically a spatial multi-channel signal is received. In these scenarios, each signal is typically processed individually, as described below for the single audio signal, and the results are then combined. In fact, different signals are typically rendered from different positions and, therefore, different binaural transfer functions can be applied.
Similarly, in many embodiments, a large number of audio objects can be received and each one (or a combination of them) can be individually processed, as described.
For example, it is possible to render a combination of objects or signals with a combination of binaural transfer functions, so that each object in the combination of objects is rendered differently, for example, at different locations. In some scenarios, a combination of audio objects or signals can be processed as a combined entity. For example, a downmix of the left front and left surround channels can be rendered with a binaural transfer function which consists of a weighted mixture of the two corresponding binaural transfer functions.
The output signals can then be simply generated by combining (for example, adding) the binaural signals generated for each of the different audio signals.
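The per-source rendering and summation described above can be sketched as follows. This is an illustrative Python sketch, not the patented implementation: the HRIR filter pairs are hypothetical stand-ins for the measured binaural transfer functions.

```python
import numpy as np

def render_binaural(sources, hrirs):
    """Render each mono source signal with its own (hypothetical,
    pre-measured) left/right HRIR pair and sum the per-source
    binaural signals into one two-channel output."""
    n = max(len(s) + h.shape[1] - 1 for s, h in zip(sources, hrirs))
    out = np.zeros((2, n))
    for s, h in zip(sources, hrirs):
        m = len(s) + h.shape[1] - 1
        out[0, :m] += np.convolve(s, h[0])  # left-ear contribution
        out[1, :m] += np.convolve(s, h[1])  # right-ear contribution
    return out
```

Each source is filtered with the HRIR pair for its own virtual position, and the binaural signals are simply added, as in the text.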
Thus, although the following description focuses on a single audio signal, this may merely be considered one signal component, corresponding to a single sound source, out of a plurality of audio signals.
The receiver 301 is coupled to a binaural processor 303 which receives the audio signal and which generates the binaural output signal when processing the audio signal. The binaural processor 303 is coupled to a pair of headphones 305 that is fed with the binaural signal. Thus, the binaural signal comprises a signal for the left ear and a signal for the right ear.
It will be appreciated that, although the use of headphones may be typical for many applications, the described invention and principles are not limited to this. For example, in some situations, the sound can be rendered through loudspeakers in front of the user or at the sides of the user (for example, using a suitable mounting device). In some scenarios, binaural processing can, in these cases, be enhanced with additional processing that compensates for the crosstalk between the two loudspeakers (for example, it can compensate the signal for the right loudspeaker for the components of the left loudspeaker signal that are also heard by the right ear).
The binaural processor 303 is arranged to process the audio signal such that the processing is representative of a binaural transfer function that provides a virtual sound source position for the audio signal in the binaural output signal. In the system of FIGURE 3, the binaural transfer function is the transfer function applied to the audio signal to generate the binaural output signal. It therefore reflects the combined effect of the processing of the binaural processor 303 and may, in some embodiments, include non-linear effects, feedback effects, etc.
As part of the processing, the binaural processor 303 can apply a position-virtualizing binaural transfer function to the signal being processed. Specifically, as part of the signal path from the audio signal to the binaural output signal, a position-virtualizing binaural transfer function is applied to the signal.
The binaural transfer function specifically includes a Head Related Transfer Function (HRTF), a Head Related Impulse Response (HRIR) and/or Binaural Room Impulse Responses (BRIRs). The terms impulse response and transfer function are considered equivalent. Thus, the binaural output signal is generated to reflect the audio conditioning introduced by the listener's head and, typically, the environment, so that the audio signal appears to originate at the desired position.
Figure 4 illustrates an example of the binaural processor 303 in more detail. In the specific example, the audio signal is fed to a binaural signal processor 401, which proceeds to filter the audio signal according to the binaural transfer function. The binaural signal processor 401 comprises two subfilters, namely, one for generating the signal for the left ear channel and one for generating the signal for the right ear channel. In the example of Figure 4, the generated binaural signal is fed to an amplifier 403 that amplifies the left and right signals independently and then feeds them to the left and right speakers of the headphones 305, respectively.
The filter characteristics for the binaural signal processor 401 depend on the desired virtual position for the audio signal. In the example, the binaural processor 303 comprises a coefficient processor 405 that determines the filter characteristics and feeds them to the binaural signal processor 401. The coefficient processor 405 can specifically receive a position indication and select the appropriate filter coefficients accordingly.
In some embodiments, the audio signal may, for example, be a time domain signal and the binaural signal processor 401 may be a time domain filter, such as an IIR or FIR filter. In this scenario, the coefficient processor 405 can, for example, provide the filter coefficients. As another example, the audio signal can be converted to the frequency domain and the filtering can be applied in the frequency domain, for example, by multiplying each frequency component by a complex value corresponding to the filter's frequency transfer function. In some embodiments, processing can be done entirely in time-frequency tiles.
It will be appreciated that, in some embodiments, other processing may also be applied to the audio signal; for example, high-pass or low-pass filtering may be applied. It will also be appreciated that binaural processing for virtual sound positioning can be combined with other processing. For example, an audio signal upmixing operation in response to spatial parameters can be combined with binaural processing. For example, for an MPEG Surround signal, an input signal represented by time-frequency tiles can be upmixed to different spatial signals by applying different spatial parameters. Thus, for a given upmixed signal, each time-frequency tile can be multiplied by a complex value corresponding to the spatial/upmixing parameter, and the resulting signal can then be subjected to binaural processing by multiplying each time-frequency tile by a complex value corresponding to the binaural transfer function. In fact, in some embodiments, these operations can be combined, so that each time-frequency tile is multiplied by a single complex value that represents both the upmixing and the binaural processing (specifically, this can correspond to the multiplication of the two separate complex values).
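The fused per-tile multiplication described above can be sketched as follows. This is an illustrative assumption-laden sketch: the gain values and tile shapes are hypothetical, not taken from MPEG Surround.

```python
import numpy as np

def render_tiles(X, g_upmix, hL, hR):
    """Apply an upmix parameter and a binaural transfer value per
    time-frequency tile. The two complex multiplications are fused
    into a single complex gain per tile, as described in the text.
    All parameter values here are hypothetical illustrations."""
    gL = g_upmix * hL          # combined upmix + left binaural gain
    gR = g_upmix * hR          # combined upmix + right binaural gain
    return X * gL, X * gR

# X: complex STFT tiles of an input signal, shape (frames, bins)
X = np.ones((2, 3), dtype=complex)
L, R = render_tiles(X, 2.0, 1j, 0.5)
```

Multiplying by `g_upmix * hL` in one step is numerically equivalent to upmixing first and applying the binaural value afterwards, which is the combination the text describes.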
In conventional binaural virtual spatial audio, binaural processing is based on predetermined binaural transfer functions that are derived by measurements, typically using microphones positioned in the ears of a dummy head. For HRTFs and HRIRs, only the impact of the user, not the environment, is taken into account. However, when BRIRs are used, the characteristics of the environment in which the measurement was taken are also included. This can provide an improved user experience in many scenarios. In fact, it has been found that when virtual surround audio from headphones is played in the environment where the measurements were taken, a convincing externalization can be achieved. However, in other environments and, in particular, in environments in which the acoustic characteristics are very different (that is, when there is a clear mismatch between the reproduction and the measurement environment), the perceived externalization can degrade significantly.
In the system in Figure 3, this degradation is significantly mitigated and reduced by adapting binaural processing.
Specifically, the audio system of Figure 3 further comprises a measurement circuit 307, which performs a measurement that is dependent on, or reflects, the acoustic environment in which the system is used. Thus, the measuring circuit 307 generates measurement data that is indicative of a characteristic of the acoustic environment.
In the example, the system is coupled to a microphone 309 that captures audio signals, but it will be appreciated that, in other embodiments, other sensors and other modalities may additionally or alternatively be used.
The measurement circuit 307 is coupled to a parameter processor 311 that receives the measurement data and proceeds to generate an acoustic environment parameter in response to it. Thus, a parameter is generated which is indicative of the specific acoustic environment in which the virtual sound is rendered. For example, the parameter can indicate how echoic or reverberant the environment is.
The parameter processor 311 is coupled to an adaptation processor 313 which is arranged to adapt the binaural transfer function used by the binaural processor 303 depending on the determined acoustic environment parameter. For example, if the parameter is indicative of a very reverberant environment, the binaural transfer function can be modified to reflect a greater degree of reverberation than that of the measured BRIR.
Thus, the system in Figure 3 is capable of adapting the rendered virtual sound to reflect the audio environment in which it is used more closely. This can provide a more consistent and natural-sounding virtual sound rendering. In particular, visual position cues can be allowed to align more closely with the provided audio position cues.
The system can dynamically update the binaural transfer function and this dynamic update can, in some embodiments, be performed in real time. For example, the measurement circuit 307 can continuously perform measurements and generate current measurement data. This can be reflected in a continuously updated acoustic environment parameter and a continuously updated adaptation of the binaural transfer function. Thus, the binaural transfer function can be continuously modified to reflect the current audio environment.
This can provide a very attractive user experience. As a specific example, a bathroom tends to be dominated by very hard and acoustically very reflective surfaces with little attenuation. By contrast, a bedroom tends to be dominated by soft and attenuating surfaces, in particular for higher frequencies. Thus, a person using a pair of headphones providing virtual surround sound with the system of Figure 3 can be provided with a virtual sound that adjusts automatically when the user walks from the bathroom to the bedroom or vice versa. Thus, when the user leaves the bathroom and enters the bedroom, the sound can automatically become less reverberant and echoic to reflect the new acoustic environment.
It will be appreciated that the exact acoustic environment parameter used may depend on the preferences and needs of the individual embodiment. However, in many embodiments, it may be particularly advantageous for the acoustic environment parameter to comprise a reverberation parameter for the acoustic environment.
In fact, reverberation is not only a feature that can be measured relatively accurately using approaches of relatively low complexity, but it is also a feature that has a particularly significant impact on the user's audio perception and, in particular, on the user's spatial perception. Thus, in some embodiments, the binaural transfer function is adapted in response to a reverberation parameter for the audio environment.
It will be appreciated that the specific measurement and the parameters measured will also depend on the specific needs or preferences of the individual embodiment. In the following, several advantageous examples of the acoustic environment parameter and of methods for generating it will be described.
In some embodiments, the acoustic environment parameter may comprise a parameter indicating a reverberation time for the acoustic environment. The reverberation time can be defined as the time it takes for reflections to be reduced to a specific level. For example, the reverberation time can be determined as the time it takes for the level of the energy of the reflections to drop by 60 dB. This value is typically denoted by T60.
The reverberation time T60 can, for example, be determined by:

T60 = 0.161 · V / a

where V is the volume of the environment and a is an estimate of the equivalent absorption area.
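As a small worked example of the formula above (a sketch; the room dimensions below are hypothetical):

```python
def t60_sabine(volume_m3, absorption_area_m2):
    """Sabine's approximation: T60 = 0.161 * V / a, with V the volume
    of the environment in cubic metres and a the equivalent absorption
    area in square metres (the quantities named in the text)."""
    return 0.161 * volume_m3 / absorption_area_m2
```

For instance, a 100 m³ room with an equivalent absorption area of 16.1 m² yields a T60 of 1 second; a larger room with the same absorption yields a proportionally longer reverberation time.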
In some embodiments, predetermined features of the environment (such as V and a) can be known for several different environments. The audio system can have several of these parameters stored (for example, after a user manually enters the values). The system can then proceed to perform measurements that simply determine in which environment the user is currently located. The corresponding data can then be retrieved and used to calculate the reverberation time. The determination of the environment can be made by comparing measured audio characteristics to the audio characteristics stored for each environment. As another example, a camera can capture an image of the environment and use that to select the data that is to be retrieved. As yet another example, the measurement can include a position estimate, and the data suitable for the environment corresponding to that position can be retrieved. In yet another example, the user's preferred acoustic rendering parameters are associated with location information derived from GPS, cellular network cell identifiers, proximity to specific WiFi access points, or a light sensor that discriminates between artificial and natural light to determine whether the user is inside or outside a building.
As another example, the reverberation time can be determined by specific processing of two microphone signals, as described in more detail in Vesa, S., Härmä, A. (2005), "Automatic estimation of reverberation time from binaural signals", ICASSP 2005, pp. iii/281-iii/284, March 18-23.
In some embodiments, the system can determine an impulse response for the acoustic environment. The impulse response can then be used to determine the acoustic environment parameter. For example, the impulse response can be evaluated to determine the duration before its level is reduced by a certain amount; for example, the value of T60 is determined as the duration of the impulse response until the response has dropped by 60 dB.
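One way to read T60 off a measured impulse response is Schroeder backward integration of the energy decay curve. The sketch below is one possible implementation under that assumption, not the patented method:

```python
import numpy as np

def t60_from_impulse_response(h, fs, decay_db=60.0):
    """Estimate the reverberation time from an impulse response h
    (sampled at fs Hz). Schroeder backward integration gives the
    energy decay curve (EDC); the reverberation time is read off as
    the time at which the EDC has dropped by decay_db relative to
    its initial value."""
    edc = np.cumsum((h ** 2)[::-1])[::-1]
    edc_db = 10.0 * np.log10(edc / edc[0])
    below = np.nonzero(edc_db <= -decay_db)[0]
    return below[0] / fs if below.size else len(h) / fs
```

Applied to a synthetic exponentially decaying tail with a known T60, the estimate recovers that value to within the discretization error.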
It will be appreciated that any suitable approach for determining the impulse response can be used.
For example, the system may include a circuit that generates a sound test signal that is radiated into the acoustic environment. For example, the headphones can contain an external loudspeaker, or a separate loudspeaker unit can be used.
The microphone 309 can then monitor the audio environment, and the impulse response is generated from the captured microphone signal. For example, a very short pulse can be radiated. This signal will be reflected, generating echoes and reverberation. Thus, the test signal can approximate a Dirac pulse, and the signal captured by the microphone may, in some scenarios, directly reflect the impulse response. This approach may be particularly suitable for very quiet environments where interference from other audio sources is not present. In other scenarios, the test signal can be a known signal (such as a pseudo-noise signal) and the microphone signal can be correlated with the test signal to generate the impulse response.
In some embodiments, the acoustic environment parameter may comprise an indication of a reverberation energy relative to a direct path energy. For example, for a (discretely sampled) measured BRIR h[n], the ratio R of direct sound energy to reverberant energy can be determined as:

R = ( Σn<T h[n]² ) / ( Σn≥T h[n]² )

where T is a suitable threshold to discriminate between direct and reverberant sound (typically 5-50 ms).
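The energy ratio above can be computed directly from a sampled impulse response. A minimal sketch (the 5 ms split time is one of the thresholds the text mentions):

```python
import numpy as np

def direct_to_reverberant_ratio(h, fs, t_split=0.005):
    """Ratio R of the energy before the split time T (direct sound)
    to the energy after it (reverberation), with T the 5-50 ms
    threshold mentioned in the text."""
    k = int(round(t_split * fs))
    e = np.asarray(h, dtype=float) ** 2
    return e[:k].sum() / e[k:].sum()
```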
In some embodiments, the acoustic environment parameter may reflect the frequency spectrum of at least part of an ambient impulse response. For example, the impulse response can be transformed into the frequency domain, for example, using an FFT, and the resulting frequency spectrum can be analyzed.
For example, a modal density can be determined. A mode corresponds to a resonance or standing-wave effect for audio in the environment. Modes can, for example, be detected as peaks in the frequency domain. The presence of these modes can have an impact on the sound of the environment and, thus, the detection of the modal density can be used to provide a corresponding impact on the rendered virtual sound.
It will be appreciated that, in other scenarios, a modal density can, for example, be calculated from the characteristics of the environment using well-known formulas. For example, modal densities can be calculated from knowledge of the size of the environment.
Specifically, the modal density can be calculated as:

D(f) = 4π · V · f² / c³

where V is the volume of the environment, c is the speed of sound and f is the frequency.
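The formula above can be evaluated directly; a sketch with hypothetical room values:

```python
import math

def modal_density(f_hz, volume_m3, c=343.0):
    """Approximate modal density of a room, D(f) = 4*pi*V*f^2 / c^3
    (modes per Hz); it grows with the square of the frequency."""
    return 4.0 * math.pi * volume_m3 * f_hz ** 2 / c ** 3
```

Doubling the frequency quadruples the modal density, which is why individual modes are only audible as distinct resonances at low frequencies.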
In some embodiments, an echo density can be calculated. The echo density reflects how many echoes there are in the environment and how closely together they occur. For example, a small bathroom tends to have a relatively high number of relatively close echoes, while a large room tends to have a smaller number of echoes that are not as close together (and not as strong). This echo density parameter can therefore be advantageously used to adapt the virtual sound rendering and can be calculated from the measured impulse response.
The echo density can be determined from the impulse response or it can, for example, be calculated from the characteristics of the environment using well-known formulas. For example, the temporal echo density can be calculated as:

dN/dt = 4π · c³ · t² / V

where t is the time, c is the speed of sound and V is the volume of the environment.
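A sketch of the echo density formula above, with hypothetical room volumes, illustrating that smaller rooms give denser echo patterns:

```python
import math

def echo_density(t_s, volume_m3, c=343.0):
    """Temporal echo density dN/dt = 4*pi*c^3*t^2 / V (echoes per
    second at time t after the direct sound): smaller rooms produce
    denser echo patterns, and the density grows quadratically in t."""
    return 4.0 * math.pi * c ** 3 * t_s ** 2 / volume_m3
```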
In some embodiments, it may be advantageous to simply assess the level of early reflections. For example, a short pulse test signal can be radiated and the system can determine the combined signal level of the microphone signal over a given time interval, such as 50 ms after the transmission of the pulse.
The energy received in that time interval provides a very useful, yet low-complexity, measure of the significance of early echoes.
In some embodiments, the acoustic environment parameter can be determined to reflect an interaural coherence/correlation. The coherence/correlation between the two ears can, for example, be determined from the signals of two microphones positioned at the left and right ears, respectively. The correlation between the ears can reflect the diffuseness and can provide a particularly advantageous basis for altering the rendered virtual sound, since the diffuseness gives an indication of how reverberant the environment is. A reverberant environment will be more diffuse than an environment with little or no reverberation.
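One simple way to quantify the interaural coherence from two ear-microphone signals is the maximum of the normalized cross-correlation. This is an illustrative sketch, not the specific estimator of the patent:

```python
import numpy as np

def interaural_coherence(left, right):
    """Maximum absolute normalized cross-correlation between the two
    ear-microphone signals: close to 1 for a dry, direct sound field
    and lower for a diffuse, reverberant one."""
    l = left - left.mean()
    r = right - right.mean()
    xcorr = np.correlate(l, r, mode="full")
    denom = np.sqrt((l ** 2).sum() * (r ** 2).sum())
    return float(np.abs(xcorr).max() / denom)
```

Identical signals at the two ears give a coherence of 1, while independent (diffuse-field-like) signals give a value near 0.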
In some embodiments, the acoustic environment parameter can simply be, or comprise, an estimate of the room size. In fact, as can be clearly seen from the previous examples, the size of the environment has a significant effect on the characteristics of the sound environment. In particular, echoes and reverberation depend heavily on it. Therefore, in some scenarios, the adaptation of the rendered sound may simply be based on a determination of a room size based on a measurement.
It will be appreciated that approaches other than the determination of the ambient impulse response can be used. For example, the measurement system can alternatively or additionally use other modalities, such as vision, light, radar, ultrasound, laser, camera or other sensory measurements. These modalities may be particularly suitable for estimating the size of the environment, from which the reverberation characteristics can be determined. As another example, they may be suitable for estimating reflection characteristics (for example, the frequency response of wall reflections). For example, a camera can determine that the environment corresponds to a bathroom environment and can accordingly assume reflection characteristics corresponding to typical tiled surfaces. As another example, information about absolute and relative locations can be used.
As yet another example, an ultrasonic range determination based on ultrasonic sensors and the radiation of an ultrasonic test signal can be used to estimate the size of the environment. In other embodiments, light sensors can be used to obtain an estimate based on the light spectrum (for example, assessing whether natural or artificial light is detected, thereby allowing a differentiation between an indoor and an outdoor environment). Also, location information based on GPS could be useful. As another example, the detection and recognition of certain WiFi access points or GSM cell identifiers could be used to identify which binaural transfer function to use.
Also, it will be appreciated that, although audio measurements can, in many embodiments, advantageously be based on the radiation of an audio test signal, some embodiments may not use a test signal. For example, in some embodiments, the determination of audio characteristics, such as reverberation, frequency response or an impulse response, can be done passively by analyzing sounds that are produced by other sources in the current physical environment (for example, footsteps, a radio, etc.).
In the system of FIGURE 3, the processing of the binaural processor 303 is then modified in response to the acoustic environment parameter. Specifically, the binaural signal processor 401 processes the audio signal, according to the binaural transfer function, where the binaural transfer function is dependent on the acoustic environment parameter.
In some embodiments, the binaural signal processor 401 may comprise a data store that stores binaural transfer function data corresponding to a plurality of different acoustic environments. For example, one or more BRIRs can be stored for several different types of environment, such as a typical bathroom, bedroom, living room, kitchen, hall, car, train, etc. For each type, a plurality of BRIRs can be stored corresponding to different room sizes. For each BRIR, the characteristics of the environment in which the BRIR was measured are also stored.
The binaural signal processor 401 may further comprise a processor that is arranged to receive an acoustic environment parameter and, in response, to retrieve suitable binaural transfer function data from the data store. For example, the acoustic environment parameter can be a composite parameter comprising an indication of room size, an indication of the ratio between early and late energy, and a reverberation time. The processor can then search through the stored data to find the BRIR for which the stored environment characteristics most closely resemble the measured environment characteristics.
The processor then retrieves the best matching BRIR and applies it to the audio signal to generate the binaural signal that, after amplification, is fed to the headphones.
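The nearest-match lookup described above can be sketched as a simple distance search over the store. The store entries, parameter triple (room size, early/late ratio, T60) and distance metric below are all illustrative assumptions:

```python
def best_matching_brir(store, measured):
    """Pick the stored BRIR whose stored environment characteristics
    (hypothetically: room size, early/late energy ratio, T60) are
    closest to the measured values, using a normalized squared
    distance."""
    def dist(entry):
        return sum(((x - y) / (abs(y) + 1e-9)) ** 2
                   for x, y in zip(entry["params"], measured))
    return min(store, key=dist)["brir"]

# Hypothetical store: (room size m^3, early/late ratio, T60 s) per BRIR
store = [
    {"params": (20.0, 2.0, 0.3), "brir": "bathroom BRIR"},
    {"params": (60.0, 4.0, 0.6), "brir": "living room BRIR"},
]
```

A measurement close to the living-room characteristics retrieves the living-room BRIR, and one close to the bathroom characteristics retrieves the bathroom BRIR.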
In some embodiments, the data store can be dynamically updated and/or developed. For example, when a user is in a new environment, the acoustic environment parameter can be determined and used to generate a BRIR that corresponds to that environment. The BRIR can then be used to generate the binaural output signal. However, in addition, the BRIR can be stored in the data store together with appropriate environment characteristics, such as the acoustic environment parameter, possibly a position, etc. In this way, the data store can be dynamically built up and enhanced with new data as and when it is generated. The BRIR can then be used subsequently without having to determine it from first principles. For example, when a user returns to an environment in which he previously used the device, this can be automatically detected and the stored BRIR retrieved and used to generate the binaural output signal. Only if there is no suitable BRIR available will it be necessary to generate a new one (which can then be stored). This approach can reduce the complexity and the processing resource usage.
In some embodiments, the binaural signal processor 401 comprises two signal processing blocks. A first block can perform the processing corresponding to a binaural transfer function with a predetermined/fixed virtual position. Thus, this block can process the input signal according to a reference BRIR, HRIR or HRTF that can be generated based on reference measurements, for example, during the system design. The second signal processing block can be arranged to perform environment simulation in response to the acoustic environment parameter. Thus, in this example, the overall binaural transfer function includes a fixed or predetermined contribution from the BRIR, HRIR or HRTF and a contribution from an adaptive environment simulation process. The approach can reduce complexity and simplify the design. For example, in many embodiments, it is possible to generate a precise environment adaptation without the environment simulation processing considering the specific desired virtual positioning. Thus, the virtual positioning and the environment adaptation can be separated, with each individual signal processing block having to consider only one of these aspects.
For example, the BRIR, HRIR or HRTF can be selected to match the desired virtual position. The resulting binaural signal can then be modified to have a reverberation characteristic that corresponds to that of the environment. However, this modification can be applied regardless of the specific position of the audio sources, so that only the acoustic environment parameter needs to be considered. This approach can significantly facilitate the simulation and adaptation of the environment.
The individual processing operations can be carried out in parallel or in series. FIGURE 5 illustrates an example where a fixed HRTF processing 501 and a variable adaptive environment simulation processing 503 are applied to the audio signal in parallel. The resulting signals are then combined by a simple summation 505. FIGURE 6 illustrates an example where a fixed HRTF processing 601 and a variable adaptive environment simulation processing 603 are performed in series, so that the adaptive environment simulation processing is applied to the binaural signal generated by the HRTF processing. It will be appreciated that, in other embodiments, the order of the processing may be reversed.
In some embodiments, it may be advantageous to apply the fixed HRTF processing individually to each channel and to apply the variable adaptive environment simulation processing once, in parallel, to a mix of all channels.
The binaural signal processor 401 can specifically attempt to modify the binaural transfer function so that the binaural output signal of the audio system has characteristics that most closely resemble the characteristic(s) reflected by the acoustic environment parameter. For example, for an acoustic environment parameter that indicates a high reverberation time, the reverberation time of the generated binaural output signal is increased. In many embodiments, a reverberation characteristic is a particularly suitable parameter to adapt in order to provide a closer correlation between the generated virtual sound and the acoustic environment.
This can be achieved by modifying the environment simulation signal processing 503, 603 of the binaural signal processor 401.
In particular, the environment simulation signal processing 503, 603 can, in many embodiments, comprise a reverberator that is adapted in response to the acoustic environment parameter.
The level of early reflections can be controlled by adjusting the level of at least the part of the reverberation impulse response that includes the early reflections, relative to the level of the HRIR, HRTF or BRIR.
Thus, a synthetic reverberation algorithm can be controlled based on the estimated environment parameters.
Several synthetic reverberators are known and it will be appreciated that any suitable reverberator can be used.
FIGURE 7 presents a specific example of the environment simulation signal processing block being implemented as a feedback delay network reverberator and, specifically, as a Jot reverberator.
The environment simulation signal processing 503, 603 can proceed to adapt the parameters of the Jot reverberator to modify the characteristics of the binaural output signal. Specifically, it can modify one or more of the characteristics previously described for the acoustic environment parameter.
In fact, in the example of the Jot reverberator of FIGURE 7, the modal and echo densities can be modified by changing the relative and absolute values of the delays (mi). By adapting the values of the gains in the feedback loops, the reverberation time can be controlled. In addition, a frequency-dependent T60 can be controlled by replacing the gains with suitable filters (hi(z)).
For binaural reverberation, the outputs of the N branches can be combined in different ways (ai, βi), making it possible to generate two reverberation tails with a correlation of 0. A pair of jointly designed filters (c1(z), c2(z)) can therefore be used to control the ICC of the two reverberation outputs.
A further filter pair (tL(z), tR(z)) in the network can be used to control the spectral equalization of the reverberation. Also, the overall reverberation gain can be incorporated into this filter pair, thereby allowing control of the ratio between the direct part and the late part of the reverberation, that is, of the reverberation energy relative to the direct sound energy.
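A minimal mono sketch in the spirit of such a feedback delay network is given below. The delay values, T60 target and Householder feedback matrix are illustrative assumptions; the binaural output weights (ai, βi) and the coherence and equalization filters of the text are omitted.

```python
import numpy as np

def fdn_reverb(x, fs, delays, t60):
    """Minimal feedback delay network reverberator (mono sketch).
    Per-branch gains are set from the desired reverberation time as
    g_i = 10**(-3 * m_i / (fs * T60)), so that sound recirculating
    through a delay of m_i samples decays by 60 dB after T60 seconds."""
    n = len(delays)
    gains = np.array([10.0 ** (-3.0 * m / (fs * t60)) for m in delays])
    # Householder matrix: an orthogonal (lossless) mixing of branches.
    A = np.eye(n) - (2.0 / n) * np.ones((n, n))
    bufs = [np.zeros(m) for m in delays]
    idx = [0] * n
    out = np.zeros(len(x))
    for t in range(len(x)):
        # Read the oldest sample of each delay line, then write back
        # the input plus the mixed, attenuated feedback.
        taps = np.array([bufs[i][idx[i]] for i in range(n)])
        out[t] = taps.sum()
        fb = A @ (gains * taps)
        for i in range(n):
            bufs[i][idx[i]] = x[t] + fb[i]
            idx[i] = (idx[i] + 1) % delays[i]
    return out
```

Feeding the network an impulse produces a dense, exponentially decaying tail whose decay rate follows the chosen T60, which is the adaptation handle the text describes.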
Additional details on the use of a Jot reverberator, specifically on the relationship between the time and frequency densities and the reverberator parameters, and on the translation of a desired frequency-dependent T60 to the reverberator parameters, can be found in Jean-Marc Jot and Antoine Chaigne (1991), "Digital delay networks for designing artificial reverberators", Proc. 90th AES Convention.
Additional details on the use of a binaural Jot reverberator, and specifically on how to translate the desired interaural coherence/correlation and coloration to the reverberator parameters, can be found in Fritz Menzer and Christof Faller (2009), "Binaural reverberation using a modified Jot reverberator with frequency-dependent interaural coherence matching", Proc. 126th AES Convention.
In some embodiments, the acoustic environment parameter and the binaural transfer function can be dynamically modified to continuously adapt the rendered sound to the acoustic environment. However, in other embodiments, the binaural transfer function may only be modified when the acoustic environment parameter meets a criterion. Specifically, the criterion may be that the acoustic environment parameter must differ by more than a certain threshold from the acoustic environment parameter that was used to adjust the current processing parameters. Thus, in some embodiments, the binaural transfer function is only updated if the change in the environment characteristic(s) exceeds a certain level. This can, in many scenarios, provide an enhanced listening experience with a more stable sound rendering.
In some embodiments, the modification of the binaural transfer function can be instantaneous. For example, if a different reverberation time is suddenly measured (for example, because the user has moved to a different environment), the system can immediately change the reverberation time of the sound rendering to match it. However, in other embodiments, the system may be arranged to restrict the rate of change and, therefore, to gradually modify the binaural transfer function. For example, the transition can be implemented gradually over a time interval of, say, 1-5 seconds. The transition can, for example, be achieved by an interpolation towards the target values for the binaural transfer function, or it can, for example, be achieved by a gradual transition of the acoustic environment parameter value used to adapt the processing.
In some embodiments, the measured acoustic environment parameter and/or the corresponding processing parameters can be stored for later use by the user. For example, the user can subsequently select from previously determined values. This selection could also be carried out automatically, for example, by the system detecting that the characteristics of the current environment closely reflect characteristics previously measured. This approach can be practical for scenarios where a user frequently moves into and out of an environment.
In some embodiments, the binaural transfer function is adapted on a per-environment basis. Indeed, the acoustic environment parameter may reflect characteristics of the environment as a whole. The binaural transfer function is therefore updated to simulate the environment and provide a virtual spatial rendering that takes the characteristics of the environment into account.
In some embodiments, the acoustic environment parameter may, however, not only reflect the acoustic characteristics of the environment, but may also reflect the user's position within the environment. For example, if a user is close to a wall, the ratio between early reflections and late reverberation may change, and the acoustic environment parameter may reflect this. This can cause the binaural transfer function to be modified to provide a similar ratio between early reflections and late reverberation. Thus, as the user moves towards a wall, the direct early echoes become more significant in the rendered sound and the reverberation tail is reduced. When the user moves away from the wall, the opposite happens.
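A toy illustration of this position-dependent ratio (not part of the patent text; the linear mapping and the 5 m scale are arbitrary assumptions for the sketch):

```python
def early_late_gains(wall_distance, room_scale=5.0):
    """Map listener-to-wall distance to gains for the early-reflection and
    late-reverberation parts of the binaural transfer function: closer to
    the wall, early reflections are boosted and the tail is attenuated.
    `proximity` is 1 at the wall and 0 at `room_scale` metres or beyond."""
    proximity = max(0.0, 1.0 - wall_distance / room_scale)
    early_gain = 1.0 + proximity        # stronger early echoes near the wall
    late_gain = 1.0 - 0.5 * proximity   # reduced reverberation tail near the wall
    return early_gain, late_gain
```

At the wall this yields gains of (2.0, 0.5); far from any wall both gains return to unity, restoring the whole-environment balance.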
In some embodiments, the system may be arranged to update the binaural transfer function in response to a user's position. This can be done indirectly, as described in the example above.
Specifically, adaptation can occur indirectly by determining an acoustic environment parameter that is dependent on the user's position and, specifically, that is dependent on the user's position within an environment.
In some embodiments, a position parameter indicative of a user's position can be generated and used to adapt the binaural transfer function. For example, a camera can be installed and visual detection techniques used to locate the user in the environment. The corresponding position estimate can then be transmitted to the audio system (for example, using wireless communications) and used to adapt the binaural transfer function.
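Once such a position estimate is available, one simple use (an illustrative sketch, not part of the patent text; the coordinate convention is an assumption) is to recompute the azimuth of the virtual source relative to the listener and select the corresponding binaural filter pair:

```python
import math

def source_azimuth_deg(listener_xy, source_xy):
    """Azimuth of the virtual sound source as seen from the estimated
    listener position, in degrees (0 = straight ahead along +y,
    90 = directly to the right), used to pick a binaural filter pair."""
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]
    return math.degrees(math.atan2(dx, dy))
```

As the camera reports a new listener position, the azimuth (and thus the selected filter pair) updates so the virtual source stays anchored in the room.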
It will be appreciated that the above description, for clarity, has described embodiments of the invention with reference to different functional circuits, units and processors. However, it will be apparent that any suitable distribution of functionality between different functional circuits, units or processors can be used without detracting from the invention. For example, functionality illustrated as being performed by separate processors or controllers can be performed by the same processor or controller. Thus, references to specific functional units or circuits should only be seen as references to suitable means for providing the described functionality, rather than being indicative of a strict logical or physical structure or organization.
The invention can be implemented in any suitable form, including hardware, software, firmware or any combination thereof. The invention can optionally be implemented, at least partially, as computer software running on one or more data processors and/or digital signal processors. The elements and components of an embodiment of the invention can be physically, functionally and logically implemented in any suitable way. Indeed, functionality can be implemented in a single unit, in a plurality of units, or as part of other functional units. As such, the invention can be implemented in a single unit or it can be physically and functionally distributed among different units, circuits and processors.
Although the present invention has been described in connection with some embodiments, it is not intended to be limited to the specific form set forth herein. Rather, the scope of the present invention is limited only by the appended claims. In addition, although an aspect may appear to be described in connection with particular embodiments, one skilled in the art will recognize that various aspects of the described embodiments can be combined according to the invention. In the claims, the term "comprising" does not exclude the presence of other elements or steps.
In addition, although listed individually, a plurality of means, elements, circuits or method steps can be implemented, for example, by a single circuit, unit or processor. In addition, although individual aspects can be included in different claims, they can possibly be combined advantageously, and inclusion in different claims does not imply that a combination of aspects is not feasible and/or advantageous. Also, the inclusion of an aspect in one category of claims does not imply a limitation to that category; on the contrary, it indicates that the aspect is equally applicable to other claim categories, as appropriate. In addition, the order of the aspects in the claims does not imply any specific order in which the aspects are to be worked; in particular, the order of individual steps in a method claim does not imply that the steps must be performed in that order; rather, the steps can be performed in any suitable order. In addition, references in the singular do not exclude a plurality. Thus, references to "a", "an", "first", "second", etc. do not preclude a plurality. The reference signs in the claims are provided merely as a clarifying example and should not be construed as limiting the scope of the claims in any way.
Claims (14)
[0001]
1. AUDIO SYSTEM, characterized by comprising: a receiver (301) for receiving an audio signal; a binaural circuit (303) for generating a binaural output signal by processing the audio signal, the processing being representative of a binaural transfer function that provides a virtual sound source position for the audio signal, wherein said binaural transfer function corresponds to a Binaural Room Impulse Response; a measurement circuit (307) for generating measurement data indicative of a characteristic of an acoustic environment; a determination circuit (311) for determining an acoustic environment parameter in response to the measurement data; and an adaptation circuit (313) for adapting the binaural transfer function in response to the acoustic environment parameter, wherein the adaptation circuit (313) is arranged to dynamically update the binaural transfer function to correspond to the acoustic environment.
[0002]
2. AUDIO SYSTEM, according to claim 1, wherein the acoustic environment parameter is characterized by comprising a reverberation parameter for the acoustic environment.
[0003]
3. AUDIO SYSTEM, according to claim 1, wherein the acoustic environment parameter is characterized by comprising at least one of: - a reverberation time; - a reverberation energy relative to a direct path energy; - a frequency spectrum of at least part of an ambient impulse response; - a modal density of at least part of an ambient impulse response; - an echo density of at least part of an ambient impulse response; - an interaural coherence or correlation; - a level of early reflections; and - an estimate of the size of the environment.
[0004]
4. AUDIO SYSTEM, according to claim 1, characterized in that the adaptation circuit (313) is arranged to adapt a reverberation characteristic of the binaural transfer function.
[0005]
5. AUDIO SYSTEM, according to claim 1, characterized in that the adaptation circuit (313) is arranged to adapt at least one of the following characteristics of the binaural transfer function: - a reverberation time; - a reverberation energy relative to a direct sound energy; - a frequency spectrum of at least part of the binaural transfer function; - a modal density of at least part of the binaural transfer function; - an echo density of at least part of the binaural transfer function; - an interaural coherence or correlation; and - a level of early reflections of at least part of the binaural transfer function.
[0006]
6. AUDIO SYSTEM, according to claim 1, wherein the processing is characterized by comprising a combination of a predetermined binaural transfer function and a variable binaural transfer function adapted in response to the acoustic environment parameter.
[0007]
7. AUDIO SYSTEM, according to claim 1, characterized by the adaptation circuit (313) being arranged to modify the binaural transfer function only when the environment characteristic meets a criterion.
[0008]
8. AUDIO SYSTEM, according to claim 1, characterized in that the adaptation circuit is arranged to gradually modify the binaural transfer function over a period of time.
[0009]
9. AUDIO SYSTEM, according to claim 1, characterized in that it further comprises: a data store for storing binaural transfer function data; a circuit for retrieving binaural transfer function data from the data store in response to the acoustic environment parameter; and wherein the adaptation circuit is arranged to adapt the binaural transfer function in response to the retrieved binaural transfer function data.
[0010]
10. AUDIO SYSTEM, according to claim 1, characterized by further comprising: a test signal circuit arranged to radiate a sound test signal into the acoustic environment; and wherein the measurement circuit (307) is arranged to capture a sound signal received in the environment, the received sound signal comprising a signal component originating from the radiated sound test signal; and the determination circuit (311) is arranged to determine the acoustic environment parameter in response to the sound test signal.
[0011]
11. AUDIO SYSTEM, according to claim 10, characterized in that the determination circuit (311) is arranged to determine an ambient impulse response in response to the received sound signal and to determine the acoustic environment parameter in response to the ambient impulse response.
[0012]
12. AUDIO SYSTEM, according to claim 1, characterized in that the adaptation circuit (313) is also arranged to update the binaural transfer function in response to a user position.
[0013]
13. AUDIO SYSTEM, according to claim 1, wherein the binaural circuit (303) is characterized by comprising a reverberator; and the adaptation circuit (313) is arranged to adapt a reverberation processing of the reverberator in response to the acoustic environment parameter.
[0014]
14. OPERATING METHOD FOR AN AUDIO SYSTEM, the method being characterized by comprising: receiving an audio signal; generating a binaural output signal by processing the audio signal, the processing being representative of a binaural transfer function that provides a virtual sound source position for the audio signal, wherein said binaural transfer function corresponds to a Binaural Room Impulse Response; generating measurement data indicative of a characteristic of an acoustic environment; determining an acoustic environment parameter in response to the measurement data; and adapting the binaural transfer function in response to the acoustic environment parameter, said adaptation being arranged to dynamically update the binaural transfer function to correspond to the acoustic environment.
Similar technologies:
Publication number | Publication date | Patent title
BR112013017070B1|2021-03-09|AUDIO SYSTEM AND OPERATING METHOD FOR AN AUDIO SYSTEM
US10334380B2|2019-06-25|Binaural audio processing
JP6433918B2|2018-12-05|Binaural audio processing
JP4850948B2|2012-01-11|A method for binaural synthesis taking into account spatial effects
AU2001239516B2|2004-12-16|System and method for optimization of three-dimensional audio
CN106576203B|2020-02-07|Determining and using room-optimized transfer functions
US20060056638A1|2006-03-16|Sound reproduction system, program and data carrier
US20120201405A1|2012-08-09|Virtual surround for headphones and earbuds headphone externalization system
US10341799B2|2019-07-02|Impedance matching filters and equalization for headphone surround rendering
WO2014091375A1|2014-06-19|Reverberation processing in an audio signal
CN112005559B|2022-03-04|Method for improving positioning of surround sound
Laitinen2008|Binaural reproduction for directional audio coding
US20160044432A1|2016-02-11|Audio signal processing apparatus
Pelzer et al.2011|3D reproduction of room acoustics using a hybrid system of combined crosstalk cancellation and ambisonics playback
KR20210059758A|2021-05-25|Apparatus and method for applying virtual 3D audio to a real room
Tamulionis et al.2019|Listener Movement Prediction based Realistic Real-Time Binaural Rendering
Family patents:
Publication number | Publication date
US20130272527A1|2013-10-17|
JP2014505420A|2014-02-27|
TR201815799T4|2018-11-21|
CN103329576A|2013-09-25|
WO2012093352A1|2012-07-12|
RU2595943C2|2016-08-27|
RU2013136390A|2015-02-10|
EP2661912A1|2013-11-13|
CN103329576B|2016-12-07|
US9462387B2|2016-10-04|
JP5857071B2|2016-02-10|
EP2661912B1|2018-08-22|
BR112013017070A2|2019-04-30|
Cited references:
Publication number | Filing date | Publication date | Applicant | Patent title

US4188504A|1977-04-25|1980-02-12|Victor Company Of Japan, Limited|Signal processing circuit for binaural signals|
DE4328620C1|1993-08-26|1995-01-19|Akg Akustische Kino Geraete|Process for simulating a room and / or sound impression|
JPH0787599A|1993-09-10|1995-03-31|Matsushita Electric Ind Co Ltd|Sound image moving device|
US5485514A|1994-03-31|1996-01-16|Northern Telecom Limited|Telephone instrument and method for altering audible characteristics|
JPH07288900A|1994-04-19|1995-10-31|Matsushita Electric Ind Co Ltd|Sound field reproducing device|
US6222927B1|1996-06-19|2001-04-24|The University Of Illinois|Binaural signal processing system and method|
JP2000330597A|1999-05-20|2000-11-30|Matsushita Electric Ind Co Ltd|Noise suppressing device|
AUPQ941600A0|2000-08-14|2000-09-07|Lake Technology Limited|Audio frequency response processing sytem|
JP2003009296A|2001-06-22|2003-01-10|Matsushita Electric Ind Co Ltd|Acoustic processing unit and acoustic processing method|
JP4171675B2|2003-07-15|2008-10-22|パイオニア株式会社|Sound field control system and sound field control method|
US7394903B2|2004-01-20|2008-07-01|Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V.|Apparatus and method for constructing a multi-channel output signal or for generating a downmix signal|
CN100421152C|2004-07-30|2008-09-24|英业达股份有限公司|Sound control system and method|
GB0419346D0|2004-09-01|2004-09-29|Smyth Stephen M F|Method and apparatus for improved headphone virtualisation|
KR20060022968A|2004-09-08|2006-03-13|삼성전자주식회사|Sound reproducing apparatus and sound reproducing method|
JP2008513845A|2004-09-23|2008-05-01|コーニンクレッカフィリップスエレクトロニクスエヌヴィ|System and method for processing audio data, program elements and computer-readable medium|
US8204261B2|2004-10-20|2012-06-19|Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V.|Diffuse sound shaping for BCC schemes and the like|
WO2006126161A2|2005-05-26|2006-11-30|Bang & Olufsen A/S|Recording, synthesis and reproduction of sound fields in an enclosure|
WO2007076863A1|2006-01-03|2007-07-12|Slh Audio A/S|Method and system for equalizing a loudspeaker in a room|
WO2007080211A1|2006-01-09|2007-07-19|Nokia Corporation|Decoding of binaural audio signals|
CN101356573B|2006-01-09|2012-01-25|诺基亚公司|Control for decoding of binaural audio signal|
FR2899424A1|2006-03-28|2007-10-05|France Telecom|Audio channel multi-channel/binaural e.g. transaural, three-dimensional spatialization method for e.g. ear phone, involves breaking down filter into delay and amplitude values for samples, and extracting filter`s spectral module on samples|
US7957548B2|2006-05-16|2011-06-07|Phonak Ag|Hearing device with transfer function adjusted according to predetermined acoustic environments|
US7876903B2|2006-07-07|2011-01-25|Harris Corporation|Method and apparatus for creating a multi-dimensional communication space for use in a binaural audio system|
US20080147411A1|2006-12-19|2008-06-19|International Business Machines Corporation|Adaptation of a speech processing system from external input that is not directly related to sounds in an operational acoustic environment|
AU2008309951B8|2007-10-09|2011-12-22|Dolby International Ab|Method and apparatus for generating a binaural audio signal|
CN101184349A|2007-10-10|2008-05-21|昊迪移通技术有限公司|Three-dimensional ring sound effect technique aimed at dual-track earphone equipment|
JP2009206691A|2008-02-27|2009-09-10|Sony Corp|Head-related transfer function convolution method and head-related transfer function convolution device|
EP2258120B1|2008-03-07|2019-08-07|Sennheiser Electronic GmbH & Co. KG|Methods and devices for reproducing surround audio signals via headphones|
JP2008233920A|2008-03-28|2008-10-02|Sony Corp|Sound reproducing device and sound reproducing method|
JP5092974B2|2008-07-30|2012-12-05|富士通株式会社|Transfer characteristic estimating apparatus, noise suppressing apparatus, transfer characteristic estimating method, and computer program|
KR101313516B1|2008-07-31|2013-10-01|프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베.|Signal generation for binaural signals|
EP2337375B1|2009-12-17|2013-09-11|Nxp B.V.|Automatic environmental acoustics identification|
EP2637427A1|2012-03-06|Thomson Licensing|Method and apparatus for playback of a higher-order ambisonics audio signal|
TWI481892B|2012-12-13|2015-04-21|Ind Tech Res Inst|Pulse radar ranging apparatus and ranging algorithm thereof|
WO2014146668A2|2013-03-18|2014-09-25|Aalborg Universitet|Method and device for modelling room acoustic based on measured geometrical data|
CN104982042B|2013-04-19|2018-06-08|韩国电子通信研究院|Multi channel audio signal processing unit and method|
JP5998306B2|2013-05-16|2016-09-28|コーニンクレッカ フィリップス エヌ ヴェKoninklijke Philips N.V.|Determination of room size estimation|
EP2830043A3|2013-07-22|2015-02-18|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Method for Processing an Audio Signal in accordance with a Room Impulse Response, Signal Processing Unit, Audio Encoder, Audio Decoder, and Binaural Renderer|
US9319819B2|2013-07-25|2016-04-19|Etri|Binaural rendering method and apparatus for decoding multi channel audio|
US10141004B2|2013-08-28|2018-11-27|Dolby Laboratories Licensing Corporation|Hybrid waveform-coded and parametric-coded speech enhancement|
US10469969B2|2013-09-17|2019-11-05|Wilus Institute Of Standards And Technology Inc.|Method and apparatus for processing multimedia signals|
CN103607669B|2013-10-12|2016-07-13|公安部第三研究所|A kind of building conversational system audio transmission characteristic detecting method and detecting system|
EP3062535B1|2013-10-22|2019-07-03|Industry-Academic Cooperation Foundation, Yonsei University|Method and apparatus for processing audio signal|
CN104661169B|2013-11-25|2018-11-06|深圳中电长城信息安全系统有限公司|A kind of audio testing method and device|
KR20210094125A|2013-12-23|2021-07-28|주식회사 윌러스표준기술연구소|Method for generating filter for audio signal, and parameterization device for same|
WO2015102920A1|2014-01-03|2015-07-09|Dolby Laboratories Licensing Corporation|Generating binaural audio in response to multi-channel audio using at least one feedback delay network|
CN104768121A|2014-01-03|2015-07-08|杜比实验室特许公司|Generating binaural audio in response to multi-channel audio using at least one feedback delay network|
CN105900457B|2014-01-03|2017-08-15|杜比实验室特许公司|The method and system of binaural room impulse response for designing and using numerical optimization|
CN107770718B|2014-01-03|2020-01-17|杜比实验室特许公司|Generating binaural audio by using at least one feedback delay network in response to multi-channel audio|
US9866986B2|2014-01-24|2018-01-09|Sony Corporation|Audio speaker system with virtual music performance|
EP3122073A4|2014-03-19|2017-10-18|Wilus Institute of Standards and Technology Inc.|Audio signal processing method and apparatus|
KR101856540B1|2014-04-02|2018-05-11|주식회사 윌러스표준기술연구소|Audio signal processing method and device|
DE102014210215A1|2014-05-28|2015-12-03|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Identification and use of hearing room optimized transfer functions|
US10003886B2|2014-06-30|2018-06-19|Uri El Zur|Systems and methods for adaptive noise management|
WO2016002358A1|2014-06-30|2016-01-07|ソニー株式会社|Information-processing device, information processing method, and program|
WO2016009863A1|2014-07-18|2016-01-21|ソニー株式会社|Server device, and server-device information processing method, and program|
EP3197182B1|2014-08-13|2020-09-30|Samsung Electronics Co., Ltd.|Method and device for generating and playing back audio signal|
JP2018509864A|2015-02-12|2018-04-05|ドルビー ラボラトリーズ ライセンシング コーポレイション|Reverberation generation for headphone virtualization|
EA202090186A3|2015-10-09|2020-12-30|Долби Интернешнл Аб|AUDIO ENCODING AND DECODING USING REPRESENTATION CONVERSION PARAMETERS|
US9734686B2|2015-11-06|2017-08-15|Blackberry Limited|System and method for enhancing a proximity warning sound|
US10614819B2|2016-01-27|2020-04-07|Dolby Laboratories Licensing Corporation|Acoustic environment simulation|
CN109076305B|2016-02-02|2021-03-23|Dts(英属维尔京群岛)有限公司|Augmented reality headset environment rendering|
US9826332B2|2016-02-09|2017-11-21|Sony Corporation|Centralized wireless speaker system|
US9924291B2|2016-02-16|2018-03-20|Sony Corporation|Distributed wireless speaker system|
US10142755B2|2016-02-18|2018-11-27|Google Llc|Signal processing methods and systems for rendering audio on virtual loudspeaker arrays|
US9591427B1|2016-02-20|2017-03-07|Philip Scott Lyren|Capturing audio impulse responses of a person with a smartphone|
US9826330B2|2016-03-14|2017-11-21|Sony Corporation|Gimbal-mounted linear ultrasonic speaker assembly|
US9881619B2|2016-03-25|2018-01-30|Qualcomm Incorporated|Audio processing for an acoustical environment|
US9906851B2|2016-05-20|2018-02-27|Evolved Audio LLC|Wireless earbud charging and communication systems and methods|
GB201609089D0|2016-05-24|2016-07-06|Smyth Stephen M F|Improving the sound quality of virtualisation|
US9584946B1|2016-06-10|2017-02-28|Philip Scott Lyren|Audio diarization system that segments audio input|
US9794724B1|2016-07-20|2017-10-17|Sony Corporation|Ultrasonic speaker assembly using variable carrier frequency to establish third dimension sound locating|
US9924286B1|2016-10-20|2018-03-20|Sony Corporation|Networked speaker system with LED-based wireless communication and personal identifier|
US9854362B1|2016-10-20|2017-12-26|Sony Corporation|Networked speaker system with LED-based wireless communication and object detection|
US10075791B2|2016-10-20|2018-09-11|Sony Corporation|Networked speaker system with LED-based wireless communication and room mapping|
US10242449B2|2017-01-04|2019-03-26|Cisco Technology, Inc.|Automated generation of pre-labeled training data|
CN110326310B|2017-01-13|2020-12-29|杜比实验室特许公司|Dynamic equalization for crosstalk cancellation|
JP6791001B2|2017-05-10|2020-11-25|株式会社Jvcケンウッド|Out-of-head localization filter determination system, out-of-head localization filter determination device, out-of-head localization determination method, and program|
EP3410747A1|2017-06-02|2018-12-05|Nokia Technologies Oy|Switching rendering mode based on location data|
CN109286889A|2017-07-21|2019-01-29|华为技术有限公司|A kind of audio-frequency processing method and device, terminal device|
WO2019031652A1|2017-08-10|2019-02-14|엘지전자 주식회사|Three-dimensional audio playing method and playing apparatus|
KR20200071099A|2017-10-17|2020-06-18|매직 립, 인코포레이티드|Mixed reality space audio|
CN108269578B|2018-02-05|2019-10-18|百度在线网络技术(北京)有限公司|Method and apparatus for handling information|
CN112567768A|2018-06-18|2021-03-26|奇跃公司|Spatial audio for interactive audio environments|
CN110677802A|2018-07-03|2020-01-10|百度在线网络技术(北京)有限公司|Method and apparatus for processing audio|
WO2020231883A1|2019-05-15|2020-11-19|Ocelot Laboratories Llc|Separating and rendering voice and ambience signals|
JP2021131434A|2020-02-19|2021-09-09|ヤマハ株式会社|Sound signal processing method and sound signal processing device|
JP2021131432A|2020-02-19|2021-09-09|ヤマハ株式会社|Sound signal processing method and sound signal processing device|
Legal status:
2019-05-14| B06F| Objections, documents and/or translations needed after an examination request according [chapter 6.6 patent gazette]|
2019-06-25| B25D| Requested change of name of applicant approved|Owner name: KONINKLIJKE PHILIPS N.V. (NL) |
2019-07-16| B25G| Requested change of headquarter approved|Owner name: KONINKLIJKE PHILIPS N.V. (NL) |
2019-10-29| B06U| Preliminary requirement: requests with searches performed by other patent offices: procedure suspended [chapter 6.21 patent gazette]|
2020-12-29| B09A| Decision: intention to grant [chapter 9.1 patent gazette]|
2021-03-09| B16A| Patent or certificate of addition of invention granted [chapter 16.1 patent gazette]|Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 03/01/2012, SUBJECT TO THE LEGAL CONDITIONS. |
Priority:
Application number | Filing date | Patent title
EP11150155.7|2011-01-05|
EP11150155|2011-01-05|
PCT/IB2012/050023|WO2012093352A1|2011-01-05|2012-01-03|An audio system and method of operation therefor|